Gesture and Speech for Video Content Navigation
Abstract
This article describes ongoing research in the use of computer vision gesture and speech recognition techniques as a natural interface for video content navigation, and the design of a navigation and browsing system that caters to these natural means of computer-human interaction. For consumer applications, video content navigation presents two challenges: (1) how to parse and summarize multiple video streams in an intuitive and efficient manner, and (2) what type of interface will make video browsing and navigation easier to use in a living room setting or an interactive environment. In this paper, we address these issues and propose techniques that combine video content navigation with gesture and speech recognition, seamlessly and intuitively, in an integrated system. We present a new type of browser for navigating video content, as well as a gesture and speech recognition interface for this browser.
Similar resources
Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study
Despite considerable advances in recognizing hand gestures from still images, many challenges remain in classifying hand gestures in videos. The latter comes with additional challenges, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...
Oscillatory gestures and discourse
Gesture and speech are part of a single human language system. They are co-expressive and complementary channels in the act of speaking. While speech carries the major load of symbolic presentation, gesture provides the imagistic content. Proceeding from the established cotemporality of gesture and speech, we discuss our work on oscillatory gestures and speech. We present our wavelet-based appr...
Neural Network Performance Analysis for Real Time Hand Gesture Tracking Based on Hu Moment and Hybrid Features
This paper presents a comparison study between multilayer perceptron (MLP) and radial basis function (RBF) neural networks, with supervised learning and the backpropagation algorithm, for tracking hand gestures. Both networks have two output classes, hand and face. Skin is detected in the image by a region-based algorithm, and then the networks are applied on video sequences frame by frame in...
The development of iconicity in children's co-speech gesture and homesign.
Gesture can illustrate objects and events in the world by iconically reproducing elements of those objects and events. Children do not begin to express ideas iconically, however, until after they have begun to use conventional forms. In this paper, we investigate how children's use of iconic resources in gesture relates to the developing structure of their communicative systems. Using longitudi...
Speech recognition techniques for a sign language recognition system
One of the most significant differences between automatic sign language recognition (ASLR) and automatic speech recognition (ASR) stems from the computer vision problems, whereas the corresponding problems in speech signal processing have been solved through intensive research over the last 30 years. We present our approach where we start from a large vocabulary speech recognition system to profit ...
Publication date: 1998